
Reviews: The streaming rollout of deep networks - towards fully model-parallel execution

Neural Information Processing Systems

A main motivation is to increase the efficiency (e.g., response time) of the network during training/inference. A rollout is a graph that captures the functional dependency of network nodes over time. The authors argue that the different possible rollouts differ in quality (e.g., response time), introduce mathematical definitions to describe rollouts (e.g., validity, model-parallelizability), and analyze rollouts theoretically and experimentally.

In my understanding, the actual conclusion of the paper is that the streaming rollout ("R \equiv 1") is the best one; e.g., the theorem in L192 states that the streaming rollout achieves the lowest response time over the entire graph. The experiments seem to support that conclusion. Note that how to obtain the streaming rollout is not clearly stated by the authors, although the theorem in L192 seems to suggest a working rule for obtaining it.

Pro: Originality/Significance: I'm not aware of earlier work that analyzes this low-level implementation issue, but it is worthwhile to analyze this for optimization purposes.


The streaming rollout of deep networks - towards fully model-parallel execution

Fischer, Volker, Koehler, Jan, Pfeil, Thomas

Neural Information Processing Systems

Deep neural networks, and in particular recurrent networks, are promising candidates to control autonomous agents that interact in real-time with the physical world. However, this requires a seamless integration of temporal features into the network's architecture. For training and inference, recurrent neural networks are usually rolled out over time, and different rollouts exist. In this study, we present a theoretical framework to describe rollouts and the level of model-parallelization they induce, and we demonstrate differences in solving specific tasks. We prove that certain rollouts, also for networks with only skip and no recurrent connections, enable earlier and more frequent responses, and we show empirically that these early responses have better performance.
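The rollout distinction the abstract describes can be sketched for a simple chain of layers. In a sequential rollout, each layer reads its predecessor's value from the same time step, so layers must be evaluated in order; in a streaming rollout, every connection spans one time step, so all layers can update simultaneously from the previous state. The toy scalar "layers", the fixed weight 0.5, and the helper names below are illustrative assumptions for this sketch, not taken from the paper:

```python
import math

depth = 3
weights = [0.5] * depth  # hypothetical toy scalar weights, one per layer

def sequential_step(x):
    # Sequential rollout: layer i reads layer i-1's value from the SAME
    # time step, so the layers must be evaluated one after another.
    new = [x]
    for w in weights:
        new.append(math.tanh(w * new[-1]))
    return new

def streaming_step(state, x):
    # Streaming rollout: every edge spans one time step, so every layer
    # updates simultaneously from the PREVIOUS state (fully model-parallel).
    return [x] + [math.tanh(weights[i] * state[i]) for i in range(depth)]

x = 1.0

# Sequential rollout: a new input reaches the output within a single step,
# but the per-step computation cannot be parallelized across layers.
seq = sequential_step(x)
print(seq[-1] != 0.0)  # → True

# Streaming rollout: the output responds only after `depth` steps, but each
# step is one fully parallel update of all layers, so new inputs can be
# consumed (and responses produced) at every step in a pipelined fashion.
st = [x] + [0.0] * depth
responded = []
for _ in range(depth):
    st = streaming_step(st, x)
    responded.append(st[-1] != 0.0)
print(responded)  # → [False, False, True]
```

This illustrates the trade-off the paper analyzes: the streaming rollout pays a pipeline-fill latency proportional to network depth, but afterwards yields a response every step with all node updates independent of each other.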